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Abstract 

In  this  paper,  we  derive  and  generalize  the  methods  of  Buneman 
for  solving  elliptic  partial  difference  equations  in  a  rectangular 
region.  We  show  why  the  Buneman  methods  lead  to  numerically  accurate 
solutions  whereas  the  CORF  algorithm  may  be  numerically  unstable. 
Several  numerical  examples  are  given  and  discussed. 


Introduction 


In  the  first  part  of  this  report,  we  described  several  direct 
methods  for  solving  linear  equations  arising  from  elliptic  partial 
difference  equations.  In  this  part,  we  develop  the  Buneman  algorithms 
which  are  closely  related  to  the  Cyclic  Odd/Even  Reduction  and 
Factorization  (CORF)  algorithm  which  was  derived  in  the  first  part. 

We  then  show  why  the  CORF  algorithm  is  numerically  unstable  whereas 
the  Buneman  algorithms  yield  numerically  accurate  results.  Finally, 
we  describe  some  numerical  examples  and  compare  the  time  and  accuracy 
of  several  methods  for  solving  them. 


10.  Accuracy  of  the  CORF  algorithm 

As  will  be  shown  in  Section  11,  the  CORF  algorithm  and  the  Buneman 
algorithms  are  mathematically  identical.  The  difference  between  the 
methods  lies  in  the  way  the  right  hand  side  is  calculated  at  each 
stage  of  the  reduction.  To  the  authors’  knowledge,  this  is  the  only 
direct  method  for  solving  linear  equations  in  which  the  right  hand  side 
of  the  equations  plays  an  important  r&le  in  the  numerical  solution  of 
the  equations.  In  this  section,  we  show  the  difficulties  encountered 


in  using  the  CORF  algorithm.  In  Section  13,  we  will  prove  the  stability 
of  the  Buneman  algorithms. 

( iO  ( 

Recall  from  Section  3  that  it  is  possible  to  compute  Av  'yj 
by  the  following  algorithm: 


3b  *  *^ir>  '  3i  =  ^ 


(10.1) 


3b 


-  ~A3s-1  ~  T  3s -2  f0r  S  =  2>3,--.>2 


so  that 


.  (r)  (r) 

1  r  =  A  y 

~2 


Because  of  roundoff  error,  one  actually  computes  the  sequence 


3b  -  •  3i  ■  *Jr)  +  5o 


3s  ■  "A3s-1  "  j2  3s -2  *  Ss-l  (s  "  2<  •••>2r) 


(10.2) 
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where  6g  is  the  perturbation  induced  by  the  roundoff  error.  Again 
as  in  Section  2,  we  write 


a  =  qaq,t  ,  t  =  q  n  qt 


(10.3) 


where  Q,  is  the  set  of  orthonormalized  eigenvectors  of  A  and  T  , 
and  A  and  Cl  are  the  diagonal  matrices  of  eigenvalues  of  A  and  T  , 
respectively.  Thus  substituting  (10.3)  into  (10.2),  we  have 


£q  =  -2y  ,  =  -  g  A  Iq  +  tq 


(10.1+a) 


is  “  A  is-1  ‘  n  is -2  +  2s-l 


(10.J+b) 


where 


y  -  <iTy<r) 


T  ~ 

is  - *  Js 


T 


Because  A  and  Cl  are  diagonal,  we  may  write  an  equation  for  each 
component  of  |  ;  viz. 


+  Xj^,s  +  “j  6j,s-l  “  Tj,s  d  ~ 


(10.5) 


The  solution  of  (10. 5)  can  be  given  explicitly.  Consider  the  characteristic 
equation 


(a)  a  a2  +  +  a>j  =  0 


which  has  roots  p  and  y  ,  then 
J  J 
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(10.6a) 


_s  s 

**  -  y* 


8S_1  yS_1 

y  * 


.  Ji— . w  |  -  8  7  —M—  —V _  f 

Pj  -  7j  5j,l  Pj7j  Pd  -  7 .  5d,0 


s  -  1  .s-k  s-k 

*Y  \  -(j  Tj 

V  . ,  @3  "  J 


when  Pj  /  7 


-  s-1  .  , 

>s  *  ...  V  /.  ,  \«s-k-l 


sf>3  53,l'(s‘1)Pj53,0+^1  -  k)P®“  '  TJ)k  when  .  (10.6b) 


-\j/2u)j  =  cos  ©j  when  |Vj/2a^|  <  1 

=  cosh  Zj  when  |  X^/ScjOj  |  >  1 


Then  using  the  initial  conditions  (10.4a),  we  may  write  (10. 6a)  as 
follows : 


s  ,  S;1  o_k  1  sin  (s-k)© 

53,S  =  _2c03  oos(s  83)y3  +  ^  “3  sms,  j  T; 


when  |\,/2o).  |  <  1 

u  u 


(10.7a) 


-2cu  cosh(s  z  )y  +  V  a>S. 
J  J  J  k=0,  J 


-1  „  ,  .  sinh  (s-k)z. 

r1  s-k-1  i 


sinh  z .  ,i,k 
J 


when  |kj/2tUj|  >1  . 


(10.7b) 


Note  that 
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-S x 


cos  s  d . 


cosh  s  z . 

0 


(10.8) 


given  in  Section  3-  Thus 


s-1 


n  =  P  (A,T)y^  +  V  QS„  .  QT5. 

~s  ifc)  s*k  ~k 


(10.9) 


where 


fS  }.  . 

mJij 


m-1 
u).  x 

d 


sin  m  ©, 

k 

sin  Q . 

J 


sinh  m  z, 
*■ 

s inh  z . 

0 


when  |\j/2u)j|  <  1  and  i  =  j 


when  |\  /2o>.  |  >  1  and  i  =  j 
J  J  ' 


=  0 


for  i  /  j  . 


Therefore,  if  |\./2ai  |  >  1  ,  the  effect  of  the  roundoff 
J  J 

error  can  be  catastrophic.  However,  if  |\./2cu.|  <  1  ,  we  see  from 

(r)  (r) 

(10.9)  that  i]  may  be  a  good  numerical  approximation  to  A'  'y .  '  . 

~2r  ~  J 

We  now  apply  the  above  results  to  Poisson’s  equation  with 

Dirichlet  boundary  conditions.  For  the  five  point  difference  operator 

with  mesh  width  Ax  in  the  x-direction  and  Ay  in  the  y-direction, 

we  have 


*■.  =  -2[1  +  p2(l  -  cos  ^2)  ]  ,  to.  =  1 
J  *  J 


10-1+ 


and 


p  =  (Ax/Ay)  or  (4y/Ax) 

depending  on  how  one  orders  the  equations.  By  inspection 

I  >  i 

for  all  j  ;  and  hence  for  large  s  ,  equation  (10.1)  leads  to  a 
numerically  unstable  algorithm.  A  similar  result  holds  for  the  nine 
point  difference  approximation  to  Poisson’s  equation.  Using  the  five 
point  approximation  with  uniform  mesh  and  any  number  of  grid  points, 
equation  (10. 9)  predicts  severe  loss  of  accuracy  for  more  than  five 
contractions  on  a  CDC  66 00;  and  this  has  actually  been  observed.  As 
noted  in  Section  3#  Hockney  [6,  7]  has  combined  one  or  more  steps  of 
CORF  with  the  fast  Fourier  transform  to  produce  a  Poisson  solver.  For 
such  a  use  of  CORF  one  must  pay  careful  attention  to  the  above  results. 

The  cyclic  odd/even  reduction  method  can  be  used  successfully 
for  solving  tridiagonal  systems  of  equations.  In  that  situation,  one 
must  make  provision  for  the  fact  that  overflow  can  occur  during  the 
reduction  stages. 


10-5 


j 


11.  The  Buneman  algorithm  and  variants 


In  this  section,  we  shall  describe  in  detail  the  Buneman  algorithm 
[  2  ]  and  a  variation  of  it.  The  difference  between  the  Buneman  algorithm 
and  the  CORF  algorithm  lies  in  the  way  the  right  hand  side  is  calculated 
at  each  stage  of  the  reduction.  Henceforth,  we  shall  assume  that  in 
the  system  of  equations  (2.5)  T  »  I  ,  the  identity  matrix  of 
order  p  . 

Again  consider  the  system  of  equations  as  given  by  (2.5)  with 
k+1 

q  =  2  -1  .  After  one  stage  of  cyclic  reduction,  we  have 


ij-2  +  <2Ip  -  A  >?j  +  id+2  -  »J-1  +  h+1  -  *Z 


ij+l  ij 


(n.i) 


for  i  =  2,4,  ...,q-l  with  x  =  x  .  =  0  ,  the  null  vector.  Note  that 

~0  — 

the  right  hand  side  of  (11. l) may  be  written  as  follows: 


?j1}  *  ?J-1  +  -  %  ■  a<1)a'1J3  *  X,.x  +  Zi+1  -  <u-2> 


where  A^  =  (21^  -  A2)  . 
Let  us  define 


M  _  A-1  .(1)  _  v  +  v 

ii  *  3d  "  ij-i  i 


ijn-2?]1’  • 


(1)  .(1)  (1)  ,  (1) 

ti  =  A li  +  Sj 


(11.5) 
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jta*i  i •>  ? 


I 


After  r  reductions,  we  have  by  (3.3) 


y(r*i)  „  (y(r)  +  y(r)  ,  .  A(r)  y(r> 
~j-2r  ~j+2r 


(11.4) 


Let  us  write  in  a  fashion  similar  to  (11.3), 


y(r)  =  A(r)  p(r)  +  q(r) 


(11-5) 


Substituting  (11.5)  into  (11.4)  and  making  use  of  the  identity 
(A^)2  =  21^ -A^1^  from  (3.3),  we  have  the  following  relationships: 


Zi 


-  p<r)  -  (A (r))  Vr).  -  p(r)„  -  9‘r))  U1.&) 


~j-2r  ~j+2r  ~J 


Sj 


-  -<r>  +  0<r>  . 


4  r  4  r  ’  -*-1 
~j-2r  ~J+2r 


(11.6b) 


for  j  =  i2*,J-  (i  =  l,2,...,2k”r-l)  with 


(r)  =  (r)  =  (r)  =  (r)  = 

Jo  ?gk+l  So  Sgk+l 


To  compute  (A^)  ^(p^  +  p^  -  q^)  in  (11. 6a),  we  solve  the 

~4  nr  —  .nr  —  J 


"j-2  ~j+2 


system  of  equations 


A(r)(p'r)  -pf*1’)  -  PWr  ♦  PWr  -  q<r) 

-J  -J  ~j-2  ~j+2 


11-2 


where  A'  '  is  given  by  the  factorization  (3.10),  viz. 


■s*)  - 


-  ^  (A  +  2  cos  ©jr)  Ip)  , 


ojr^  =  (2j  -  l)n/2r+1 


After  k  reductions,  one  has  the  equation 


A<k)  x  =  A<k>  p<k)  +  qM 


and  hence 


X  k  .  ,<£>  MA^r1  qW 

~2  ~2  ~2 


Again  one  uses  the  factorization  of  A^  for  computing  (A^)-1  q^ 

“2 

In  order  to  back  solve,  we  use  the  relationship 


+  A(r)  x  +x  = 

-j_2r  ~0  ~j+2r  ~*5 


for  j  =  i2r  (i  =  1,2,  ...,2k+1"r-l)  with  x0  =  x  -  ©  • 
r  r  i  y» 

For  $  =  2  ,3*2  ,...,2  -2  ,  we  solve  the  system  of  equations 


A(r)(x.  -p'r))  =q<r)  -  (x  +  x  )  , 

“j.2  ~j+2 


(11.7) 


( r ) 

using  the  factorization  of  Av  '  ;  hence 


11-3 


(11.8) 


-  ^r)> 


Thus  the  Buneman  algorithm  (variant  l)  proceeds  as  follows: 

(r)  (r) 

1)  Compute  the  sequence  {p_|  ,  q.  ']  by  (11.6)  for 

r  =  1, .  ..,k  with  p(°^  =  ©  for  j  =  0,  ...,2k+1  ,  and 

J  ~ 

230)=?J  to 


2)  Backsolve  for  x.  using  (11.7)  and  (11.8). 


(r) 

It  is  possible  to  eliminate  the  sequence  {p.  '}  .  From  (11.6b), 

~  il 


we  note  that 


(rfi)  1  ,  (r)  (r)  (r+lK 

Ei  2  '5j-2h  Sj+2h  ; 


(11.9) 


where 


h 


Using  (11*9)  in  (11.6a)  and  modifying  the  subscripts  and  superscripts 
appropriately,  we  have 


(r+1)  .  (r) 


,  ,  (r-l)  .  (r) 

Sj-2h  -  ai-h  +  si 


(r-l) 

~d+h 


-  '  +  q;;~  + 


,(r) 

2j+2h 


(r)N-l  (r-l) 

aj-5h 


(A^O'^tqr^'-  qriu  +  q 


.(r) 

ii-2h 


(r-l)  (r) 

'  % 


+ 


(p-l) 

~j+5h 


] 


(11.10) 


for  j  =  (2r,2r+1,  ...,2k+1-2r)  with 


So 


(?)  _  „(*) 


1  y+ 1 
~2  - 


=  6 


2j 


(0) 


Sj1} "  ?.i-i +  y.i+i  - 2A'1 


for  all  r  , 


v+i 

for  1,2,  ...,2  -1  , 

for  5  -  2,k, ...,2k+1-2  . 


To  solve  for  x.  ,  we  use  the  relationships  (11.7 )  and.  (11.9)  so  that 


x  .1  fl(r-3)  +  _(r-l)  (r)} 

~.j  '  "  %-h  Vh  ■  % 


-  (A^V^x  +  x  n(r)) 

(A  '  _j-2h  *i+2h  Sj  ' 


(11 


Thus  the  Buncman  algorithm  (variant  2)  proceeds  as  follows: 

(r) 

1)  Compute  the  sequence  fq.j  '}  by  (11. 10 )  for  r  =  1,2,  ...,k  . 

2)  Backsolve  for  x.  using  (11. ll) . 

-w.t 

Note  that  the  Buneman  algorithm  (variant  2)  requires  half  the  storage 
that  the  Buneman  algorithm  (variant  l)  requires.  However,  the 
variant  2  algorithm  requires  approximately  twice  as  many  additions. 


The  p.  1  s  and  q.  ’ s  can  be  written  in  terms  of  the  x.  's.  In 
Section  15,  we  shall  show  how  this  affects  the  stability  of  the  methods 


Note 


ii 


+  A 


-x 


(jj-i +  ;J+i> 


and 


3jU)  -  i'j-i  *  ?ju  -  ^ 


■  2.1-2  -  (A)‘1  A<1>tij-i +  iW  +  *j* 


By  an  inductive  argument,  it  is  possible  to  show  that 


,r-l 


i  +  (-1)"1  S<r>  k  <*-<*-!>  +  * 


j+(2] 


and 


(-l)r  S<r>  A(r) 


(fj-(2k-l) 


+ 


where 


A(°))-i 


s(r)  =  (A(r_1)  pST'2^ 


•  •  • 


I 

I 

I 

0 


I 


I 

I 

I 

I 

I 

3 

I 


12.  Applications  of  the  Buneman  algorithm  to  Poisson’s  equation 

As  was  pointed  out  in  Section  4,  matrices  of  the  form  (2.5) 
arise  in  solving  the  five  point  finite  difference  approximation  to 
Poisson’s  equation  over  a  rectangular  region  with  Dirichlet  boundary 
conditions  and  hence  it  is  possible  to  use  the  methods  of  Section  11. 

For  the  five  point  approximation  to  Poisson’s  equation  over  a  rectangular 
region  with  Neumann  or  periodic  boundary  conditions  it  is  necessary  to 
modify  the  Buneman  algorithms. 

For  the  Neumann  boundary  conditions,  we  have  the  system  of 
equations 


*io  4  *1 


-  & 


x.  +  Ax,  +  x  =  y 

~0-l  ~J  ~j+l  ~d 


vJ  =  1,2,  ».»,m— l)  , 


2x  +  Ax  =  y 
_m-l  „m  Zm 


with  m  =  2 


k+1 


We  define 


It 


(1)  _ 


‘'V  *  !?> 


for  j  =  0, 2,  4,  . . . ,  2 


k+1 


where 


(1)  _  -1  (1)  _  ,  (1)) 

?o  A  lo  »  So  •  2(£l  ■  £0  }  ' 


.(1) 


=  a"1  y,  ,  q^  =  y,1  +  y,+1  -  2p^  (j  =  2,4,  ...,m-2)  , 


(1)  a-1  (!)  0/  (lh 

P  =  A  y  ,  q _  =  2(y  ,  -  p  ) 

£m  a  ji  im-1  Zm 


9 

I 

I 


12-1 


In  general  then,  as  in  Section  11,  we  have  for  r  =  1,2,  ...,k-l 


(r+l)  .  (r+l)  ( H-l)  ^  (r+l) 

'pj  '  +  <y 


where 


■(rtl)  =p'r)-(A(r)r1(2p(")-q'r))  ,  Jr1'  -  W?- p(rtl) 


50 


52r  *0 


JO 


~2 


r  ^0 


p'1"1’  =  pS^-tAWj-V^^P^.-^*)  ,  q^11  -  q<r)  *q 


rj  zi 


r  ir  r 

~j-2r  ~j+2r 


~.i-2 


for  j  =  i2r+1  (i  -  1,2,  ...,2k"r-l) 


im)  =pir)-(A(r,rWr,.-ir)) ,  ^  -a»(r).-?pLrtl 


~m  Sm 


'm-2 


'  m-2 


r  ;m 


Finally, 


y(W)  .  „<*!>  p(*l>  +  >♦!) 


where 


B(*U)  „  kI .  (A00}2 


j&i)  .  (k)  _  (k)  ,  „(k)  _  (k). 


From  (5*^i  we  see  that 


s'**1)*  „  =  ♦  q^ 

~2k  ^2k 


so  that 


-  p'r11  *  <»<*«>-  q'r11 


-qk  ;2k 


(B^k+1^ )"q^+1^  indicates  a  solution  to  the  singular  system 

B^k+1\x  -  p^1^)  =  .  The  factorization  of  B^^  is  given  by  (5*6). 

~'d  ~2^  ~2K 

The  backsubstitution  process  proceeds  as  in  Section  11.  It  is  also 

(r) 

possible  to  eliminate  the  sequence  as  was  done  in  the  previous 

section. 

For  periodic  boundary  conditions,  we  have  the  system  of  equations 


Ax.  +  +  x0  +  =  y, 

~1  ~2  jn  ~l 


x  x  +  AXj  +  xj.+1  =  y^  for  j  =  2,3, . .  .,m-l  , 


x.,  +  x  ..  +  Ax  =  y 

„1  „m-l  X,m 


We  define 


^  ♦  si1’ 


for  j  —  2,  •  •  • )  2 


where 


12-5 


I 


Pjj1)  =  A"ly2  »  <121^  =  yl  +  y3  "  2p21^  , 


^  >  Sj1’  ■  li-i  +  ?3+i '  ^j1’  ’  (3  =  u’6’—’m-2)  ’ 


(1)  .-1  (1)  ,  -  (1) 
?J  ■  A  2m  ’  5m  -  h  *  5.-1  -  ^ 


In  general  for  r  =  1,2,  .  ,.,k-l  , 


(r+1)  .  (r+l)  (r+1)  .  (r+l) 

yv  =  Av  'p'  '  +  q) 

~  J  ~  J  ~  J 


for  j  =  i2r+1  (i  =  1,2,  ...,2k"r) 


where 


J**1)  =  (r)  (A(r)v-l(  (r)  (r)  (r)v  (r)  (r)  (r)  ~(r+l) 

p  -r+i  -  p  hi  'a  >  vp  v  p  y  q  y.+v»  q  -r+l  -  q  r.  q  ^  > 


f2r+l  £2rH 


^  r  •  *  r  •  *  r+1"  H  r+1  "  *  r  *  r  ^  r+1 
~2  ~5x2  ~2  ~2r^J"  “2  ~3x2  ~2  X 


^  •  P'r) - (A(r))-Vr) ♦p<r>„ -*lr)) .  ^  .4w.M(r).-=p!r,1) 


^  -J+2r  .o 


*  r  r  *  i 
~J-2  ~j+2r 


5<rtl)  =  P<r)-(A(r))-V^P(r)r-5,lr))  •  *  ?(^)  +  -i(r>r-^r+1)  • 


2  ~m-2 


Finally  as  (12.1), 


(k+1)  _  (k+1)  (k+1)  (k+1) 

yir  B  P  V  +  q  v 


where 


12-4 


2  2 


and  B 


(k+1) 


is  defined  by  (12.2).  Then 


B(W)X  .  b<*«  p<J*>  ♦  ,<*« 

~2K  ~2  ~2K 


so  that 


x  „  -  P(!;+l)  +  (»««>)-  q'r1'  . 


~2k 


The  backsubstitution  process  proceeds  as  in  Section  11. 

f  r )  ( r ) 

It  is  possible  to  express  p^  '  and  q:  in  terms  of  x  as  in 

~|J  ~«J 

equations  (11.12)  and  (11.13). 
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13-  Accuracy  of  the  Buneman  Algorithms 

As  was  shown  in  Section  11,  the  Buneman  algorithms  consist  of 

(r)  (t) 

generating  the  sequence  of  vectors  {p^  ,q\  '}  .  Let  us  write  using 

~  O 

(11.12)  and  (11.13) 


Jr)  _  Jr)  +  (r) 

'-i  -gJ 


q<r>  =  X  *  X  -  A(r>  gir> 


where 


=  ('l)rfl  s(r)\  ^1(fj-(2k-l)  +  ~j+(2k-l)  /  (15*2) 


a(r)  .  {A(r-l)  _  A(0))-l 


Then 


llpf  ’  -  ;jr)ll2  <  lls(r)ILJ?ll’ 


r+*  r)|l  <  l|s(r)  A(r)|| 

~j-2  ~j+2  2 


2 


where 


v  ||  indicates  the  Euclidean  norm  of  a  vector  v  , 

2  <•* 

C  ||2  indicates  the  spectral  norm  of  a  matrix  C  ,  and 


;i’  ■  t  ifeA  • 


(13-la) 


(13.1b) 


(13.3) 


(13.4) 


(13.5) 


fti 
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Vn 


When  T  =  I  ,  we  may  redefine  the  polynomials  given  in  Section  5 
in  the  following  way.  Let 


i t  =  -a/2  > 


and  define 


\|r  =  cos  0  for  |\|r|  <1 

=  cosh  ©  for  |\|r|  >1 

Then  in  a  similar  fashion  to  (±0.8), 


p  k 
2* 


(a)  =  -2  cos(2k  cos"1  \jr)  for  |\|r|  <  1 

=  -2  cosh(2k  cosh"1  y)  for  |\|r|  >1 


Thus  for  A  =  A  , 


lls(r)IL  =TT  II(a<j))-1IL 

3*o 


=  i'i  mx  its  1(x1)r1i 
J-0  {x±}  23  1 

where  are  the  eigenvalues  of  A  .  Therefore  for  |\J  >2  , 


|ls(r*||,  =  2"rTT  [cosh  e.  i'1 

3-0  fSj) 


where 


©i  =  cosh"1^  \/z) 
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Finally, 


|S(r)  A^|L  =  2-r+1  x  max 


f«i} 


rr-i 

1 1  [cosh  2J  d  ]  x 

J=0  J 


cosh  2  0,, 


when  |\J  >2  . 


For  the  five  point  difference  approximation  to  Poisson’s  equation 
over  a  rectangular  region  with  Dirichlet  boundary  conditions 


\  -  -2(1  +  p2(l  -  cos  ^)) 


where  p  =  Ax/Ay  or  (Zy/Ax)  depending  on  the  ordering  of  the 
equations .  Thus 


0i  *  cosh“1(l  +  p2(l  -  cos  ^-)) 


which 


implies  ©1  >  1  for  all  i  .  Then 


max  [cosh  2^  0.  j”1  =  [cosh  2^  {cosh_1(l  +  p2(l  -  cos  r-rr))}] 


-1 


M 
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Thus  after  some  simplification, 


,(r)i 


-eO. 


-cO, 


'2  r-1 

j=0 


_2J  +  1Qi 


<  e 


(15.6) 


p  2  TT 

where  c  =  2  -1  ,  cosh  0^  =  1  +  p  (1  -  cos  — )  . 


A  similar  calculation  shows 


]jA(f)  s(r)„2  <  2  e  P 


(13-7) 


Therefore  from  (13.6)  we  see  that  for  large  r  ,  p(r^  will  be  a 
good  approximation  to  x.  .  And  from  (15.5)  and  (I5.7),  we  see  that 


(r)  -  (x  +  x  )|j  <  2  e  P 

~j-2r  ~j+2r  2 


so  that  the  ||q^  '|l  remains  bounded  throughout  the  calculation.  This 
explains  why  the  Buneman  algorithms  lead  to  numerically  stable  results 
for  solving  the  finite  difference  approximation  to  Poisson's  equation. 


1^.  Conclusions 

The  Appendix  contains  the  results  of  some  numerical  experiments 
involving  the  application  of  the  Buneman  algorithm  (variant  1),  the 
method  of  matrix  decomposition,  the  method  of  point  successive 
over-relaxation  (cf.  [10]),  and  the  Peaceman-Rachford  alternating 
direction  method  (cf.  [11])  to  the  five  point  finite  difference 
approximation  to  Laplace's  equation  over  a  rectangle  with  Dirichlet 
boundary  conditions.  In  these  experiments  the  Buneman  algorithm  was 
the  most  efficient  and  accurate;  however,  the  method  of  matrix 
decomposition  was  competitive  in  several  cases.  We  conclude,  therefore, 
that  the  Buneman  algorithm  and  the  method  of  matrix  decanposition 
sure  useful  methods  in  the  situations  where  they  apply. 
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Numerical  Experiments 


In  order  to  gain  computational  experience  with  the  methods  of  matrix 
decomposition  (MD)  and  the  Buneman  algorithm  (variant  l),  it  was  decided  to 
apply  the  algorithms  to  the  five  point  difference  approximation  of  Laplace's 
equation  with  Dirichlet  boundary  conditions.  In  addition,  in  order  to 
compare  these  methods  with  established  methods,  it  was  decided  to  apply 
the  methods  of  point  successive  over-relaxation  (SOR)  (cf.  [10])  and 
Peaceman-Rachford  alternating  direction  method  (PR)  (cf.  [11])  to  the 
same  problems.  We  did  not  attempt  to  determine  which  method  is  best  in 
general.  Those  interested  in  operation  counts,  variations  of  these  direct 
procedures,  and  customizing  the  direct  procedures  for  particular  problems 
are  referred  to  [ 4 ]  and  [ 7  ] . 

The  following  problems  were  chosen  so  that  the  computed  error  could 
be  detemined  exactly: 

Problem  #1,  u  =  1  ; 


Problem  42,  u  =  cos(x)  cosh(y)  ; 
Problem  43,  u  =  e  (sin(y)  +  cos(y))  ; 
Problem  44,  u  =  x5  -  10x5y2  +  5xy^  . 


u  =  <computed  solution  of  the  difference  equation> 


(mesh) 


[|  u  |  ,  1.0}  ; 


the  tabulated  error  is 


max  iu  -  ui 
(mesh)  I  d  ' 


One  should  note  that  in  many  cases  the  tabulated  error  is  the  truncation 
error  of  the  difference  equation. 
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Each  of  these  problems  was  solved  on  the  following  meshes  (includes 
boundary  points) : 

Mesh  #1  20  by  129  , 

Mesh  #2  1+0  by  129  > 

Mesh  #3  80  by  129  , 

Mesh  #4  129  by  129  , 

Let  p  =  &x/Ay  .  Each  of  the  four  problems  was  solved  on  each  of  the 
four  meshes  for  five  values  of  p  : 


i 

Ax 

1 

• 

0 

ro 

v_n 

.00025 

.01 

2 

.025 

.0025 

.1 

3 

.025 

.025 

1.0 

4 

.0025 

.025 

10.0 

5 

.00025 

.025 

100.0 

Thus  each  problem  was  solved  on  a  total  of  twenty  rectangular  regions. 
These  regions  were  chosen  such  that  the  lower  left-hand  comer  of  the 
rectangle  was  always  at  the  origin.  The  following  is  a  table  of  the 
coordinates  of  the  upper  right-hand  comers: 


Mesh  #1 

Mesh  #2 

Mesh  #3 

Mesh  #4 

pi 

(-5,  .032) 

(1.,  .032) 

(2.,  .032) 

(4.,  .032) 

p2 

(.5,  .322) 

(1.,  .322) 

(2.,  .322) 

(4.,  .322) 

p3 

(-5,  3.225) 

(1.,  3.225) 

(2.,  3.225) 

3.225) 

p4 

(.05,  3.225) 

(.1,  3.225) 

(.2,  3.225) 

(.^  3.225) 

Pt5 

(.005,  3.225) 

(.01,  3.225) 

(.02,  3.225) 

(.04,  3.225) 
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We  define 


V(pj I*  j )  =  max  (solution  of  prob  ftp  on  region  with  and  mesh  fti] 
-  min  (solution  of  prob  ftp  on  region  with  p^  and  mesh  ftj] 


Note  V(l,r,j)  =  0  for  all  i  and 
for  the  other  problems: 


V(2,i,j) 


Mesh  #1 

Mesh  ftz 

P1 

.1 

.42 

P2 

.15 

.47 

p3 

11.1 

11.4 

p4 

11.0 

11.0 

p5 

11.0 

11.0 

V(5,i,d) 

Mesh  #1 

Mesh  #2 

pl 

•59 

1.64 

Pg 

•95 

2.24 

p3 

3. ,84 

6.52 

p4 

2.56 

2.7 

2.46 

2.47 

j  .  The  following  tables  give  V 


Mesh  #5 

Mesh  #4 

1.37 

2.0 

1.44 

2.1 

16.4 

24.0 

11. 

11. 

11. 

11. 

Mesh  #3 

Mesh  #4 

6.22 

25-6 

7.84 

29.2 

17.2 

58.5 

2.97 

3.36 

2.5 

2.53 
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I 


v(M,j) 


Mesh  #1 

Mesh  #2 

Mesh  ^3 

Mesh  #4 

pl 

.018 

•77 

28.2 

323.0 

p2 

.07 

.92 

28.3 

323.0 

p3 

219-5 

400.0 

556.0 

1733.0 

ph 

22.9 

48.2 

98.2 

158.0 

Ps 

2.29 

4.82 

9-9 

16.1 

For  the  above  rectangles,  the  optimum  relaxation  factor  is  given  by 

/ 1  j  \  2 


1  +  Vl-B. 


where 


2 

cos 


2(pi  + 


and  Nj  is  the  number  of  grid  points  in  the  x-direction  of  the  j-th  mesh. 
The  initial  guess  for  SOR  and  PR  was  the  zero  vector. 

The  iteration  process  was  terminated  when 

Un+1  -  u*  -4 

max  - - —  <10 

(entire  mesh)  u 

I*/1]  >  10"5 

Optimum  PR  parameters  were  determined  by  Wachspress's  algorithm  [11] 
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for  cycles  of  length  2  .  Convergence  required 


max 

(complete  cycle) 


( 


max 

(entire  mesh) 


L  |un|  >  10-5 


un+1  -  un 


<  10 


.4 


u 


Because  of  this  convergence  criterion,  a  short  cycle  was  desirable. 
After  some  experimentation,  it  was  decided  to  use  a  cycle  length  of 
four  exclusively. 

All  problems  we^e  run  on  a  CDC  6600  (about  lU  decimal  digits  of 

accuracy) ;  the  PR,  MD,  and  Buneman  programs  all  used  the  same  tridiagonal 

system  solver.  The  Q,  matrix  and  eigenvalues  required  by  MD  were 

computed  with  the  QR  algorithm  for  symmetric  tridiagonal  matrices.  The 

T 

matrix  multiplications  (Q  y)  required  by  MD  were  performed  with  a 
machine  language  inner  product  routine  which  is  quite  efficient  and 
which  accumulates  the  inner  products  in  double  precision.  It  should  be 
noted  that  for  problems  with  uniform  mesh  spacing,  these  matrix  products 
may  be  performed  with  the  fast  Fourier  transform;  and  this  makes  MD 
competitive  in  speed  with  the  Buneman  algorithm.  However,  MD  is  capable 
of  handling  more  general  problems  such  as  those  with  non-uniform  mesh 
spacings;  in  these  cases  Q  must  be  computed  and  the  matrix  products 
performed.  Thus  the  MD  routine  used  in  this  study  gives  an  indication 
of  the  kind  of  performance  one  might  expect  with  these  more  general 
problems.  Note  also  that  the  matrix  multiplication  (Q  y)  requires 
0(qp  )  operations.  The  Buneman  algorithm  requires  a  total  of 
0(qp  logg  p)  operations.  Thus  as  p  becomes  small, 

MD  approaches  the  Buneman  algorithm  in  speed. 


The  following  tables  of  computation  times  are  normalized  by  the 


time  required  for  the  Buneman  algorithm  on  Mesh  #1: 


Computation  times  for  the  Buneman  algorithm  (variant  l)  and  MD 


Mesh  #1 

Mesh  #2 

Mesh  #3 

Mesh  #4 

Buneman 

1.0 

2.08 

4.31 

6.96 

MD 

1.18 

3.65 

it. 9 

4l.O 

Computation  time  for  FR 

(These  times  are  averages  over  all  four  problems.) 


Mesh  #1 

Mesh  #2 

Mesh  #3 

Mesh  #4 

Pi 

2.56 

5.17 

9.^3 

15.2 

P2 

4.85 

10.3 

20.6 

30.1 

p3 

5.44 

12.6 

31.2 

47.7 

p4 

2.56 

7.61 

15-2 

32.5 

p5 

2.25 

4.1+9 

8.58 

13.5 

Computation 

time  for 

SOR 

(These  times  are  averages  over  all 

four  problems.) 

Mesh  #1 

Mesh  #2 

Mesh  #3 

Mesh  #k 

pl 

U0.9 

89.O 

P2 

t5-l 

97.9 

p3 

16.1 

60.1 

(not  run) 

(not  run) 

p4 

7.58 

32.9 

P5 

7.22 

28.8 
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s 


I 

I 

If 

n 

ii 


T! 

ii 

?? 

ii 

ii 

ii 

l 

a 

i 

1 

i 

i 


Relative 

error  for  the  Buneman  a 

.lKorithm, 

Meah  #1 

Prob  1 

Prob  2 

Prob  3 

Prob  U 

Pi 

M-ii) 

7  (-9) 

6  (-9) 

3(-7) 

p2 

2 (-11) 

5(-7) 

M-7) 

2(-5) 

P3 

5 (-13) 

2  (-6) 

2  (-6) 

M-7) 

Plv 

2(-13) 

l(-8) 

l(-8) 

2(-9) 

p5 

2(-l3) 

1(-10) 

1(-10) 

2  (-U) 

Relative 

error  for 

MD.  Mesh  #1 

Prob  1 

Prob  2 

Prob  3 

Prob  4 

pl 

5  (-7) 

5  (-7) 

M-7) 

3(-7) 

p2 

l(-7) 

5(-7) 

M-7) 

2(-5> 

p3 

K-9) 

2  (-6) 

2(-6) 

M-7) 

pU 

2 (-10) 

l(-8) 

K-8) 

2(-9) 

p5 

8(-10) 

7 (-10) 

7 (-10) 

6(-10) 

Relative  error  for 

PR,  Mesh  #1 

Prob  1 

Prob  2 

pl 

2 (-11) 

7  (-9) 

p2 

2(-8) 

5(-7) 

p3 

7  (-8) 

2(-6) 

Pb 

2  (-9) 

l(-8) 

O  /  1 

Prob  3 
6(-9) 
M-7) 
2(-6) 
l(-8) 
1(-10) 


Prob  b 

5  (-7) 
2(-5) 
M-7) 
2(-9) 

2  (-11) 


Relative  error  for  SOR,  Mesh  #1 


Prob  1 

Prob  2 

Prob  3 

Prob  4 

Pl 

b(-k) 

MA) 

MA) 

3  (-6) 

P2 

l(-3) 

K-3) 

l(-3) 

2(-5) 

p3 

5(A) 

1(A) 

1(A) 

M-7) 

P4 

3(A) 

3(A) 

3(A) 

9(-6) 

P«5 

3(A) 

3(A) 

3(A) 

5(-5) 

Relative  error  for  the  Buneman  algorithm.  Mesh  #2 


Prob  1 

Prob  2 

Prob  3 

Prob  4 

pl 

M-n) 

7  (-9) 

6(-9) 

7  (-7) 

p2 

3  (-n) 

6(-7) 

M-7) 

5(-5) 

p3 

2 (-12) 

5  (-6) 

7  (-6) 

2(-6) 

p4 

3 (-13) 

5(-8) 

6(-8) 

8(-9) 

P^ 

7 (-13) 

6(- 10) 

6(-10) 

8  (-11) 

Relative  error  for  MD,  Mesh  #2 


Prob  1 

Prob  2 

Prob  3 

Prob  4 

pl 

l(-7) 

8  (-8) 

l(-7) 

8(-7) 

p2 

5  (-8) 

6(-7) 

5(-7) 

5(-5) 

p3 

2(-9) 

5  (-6) 

7  (-6) 

2(-6) 

P4 

2(-9) 

6  (-8) 

6(-8) 

8(-9) 

Ps 

5 (-10) 

K-9) 

l(-9) 

3 (-10) 

m 


I 


I 

Relative 

error  for  HI, 

Mesh  #2 

R 

Prob  1 

Prob  2 

Prob  3 

Prob  4 

n  Pi 

2 (-11) 

7  (-9) 

7(-9) 

7  (-7) 

*2 

M-8) 

6(-7) 

M-7) 

5(A) 

II  p3 

M-6) 

M-6) 

7(-6) 

2  (-6) 

TT  PU 

8(-10) 

6(-8) 

6(-8) 

8(-9) 

I  p5 

3 (-12) 

6(-10) 

6(-10) 

7  (-6) 

1 

Relative 

error  for  SOR, 

,  Mesh  #2 

0 

Prob  1 

Prob  2 

Prob  3 

Prob  4 

n 

MA) 

MA) 

3(A) 

2(A) 

li 

p2 

7(A) 

5(A) 

3(A) 

5(-5) 

n  *>3 

l(-3) 

6(A) 

2(A) 

2(-6) 

n 

5(A) 

MA) 

6(-5) 

7(-7) 

n 

f : 

MA) 

U(A) 

MA) 

8(-5) 

i! 

Relative 

error  for  the 

Buneman 

algorithm.  Mesh  #3 

Prob  1 

Prob  2 

Prob  3 

Prob  4 

i  |  Pi 

M-n) 

7  (-9) 

6  (-9) 

5  (-8) 

P2 

3 (-11) 

6(-7) 

M-7) 

5  (-6) 

0  p3 

K-n) 

8(-6) 

K-5) 

l(-5) 

n  •  Plt 

M-13) 

2(-7) 

2  (-7) 

3  (-8) 

D 

2  (-12)  ^(-9) 

3(-9) 

5  (-9) 

1 

I 
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Relative  error 


for  MD.  Mesh  #3 


P1 

p2 

p3 

p4 

p5 


Prob  1 

Prob  2 

Prob  3 

Prob  4 

l(-9) 

8  (-9) 

7  (-9) 

5(-8) 

l(-9) 

6(-7) 

4(  -7) 

5  (-6) 

8(-10) 

8  (-6) 

l(-5) 

l(-5) 

8 (-10) 

2  (-7) 

2(-7) 

3  (-8) 

8 (-10) 

3  (-9) 

3  (-9) 

7 (-10) 

Relative  error  for  PR,  Mesh  #3 


pl 

P2 

p3 

p4 

p5 


Prob  1 

K-ll) 

6(-8) 

3(-6) 

2(-7) 

K-ll) 


Prob  2 

7  (-9) 
6(-7) 
8(-6) 

3  (-7) 
3  (-9) 


Prob  3 
6(-9) 
M-7) 
K-5) 

2  (-7) 

3  (-9) 


Prob  4 
5  (-8) 
4(-6) 
l(-5) 

3  (-8) 

3 (-10) 


Relative  error  for  the  Buneman 


zor it hm.  Mesh  #4 


Prob  1  Prob  2  Prob  3  Prob  4 

Pl  4 (-11)  7 (-9)  6("9)  8("9) 

p2  3  (-H)  K-7)  M-7)  7  (-7) 

p5  3  (-H)  8(-6)  2(-5)  l(-5) 

p4  l(-12)  5(-7)  6(-7)  8("8) 

Pc  4(-12)  6(-9)  7  (-9)  8<-J0) 
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